Cell, Chemical and Anatomical Views of the Gene Ontology: Mapping to a Roche Controlled Vocabulary

Authors

  • David Osumi-Sutherland
  • Enrico Ponta
  • Mélanie Courtot
  • Helen E. Parkinson
  • Laura Badi
Abstract

The Gene Ontology (GO) consists of around 40,000 terms referring to classes of biological process, cell component and gene product activity. It has been used to annotate the functions and locations of several million gene products. Much pharmacological research focuses on understanding how disease conditions differ from physiological conditions in molecular terms, with the aim of finding new drug targets for therapy. Gene set enrichment analysis (GSEA) using the GO and its annotations provides a powerful way to assess those differences. Roche has developed a bespoke controlled vocabulary (RCV) to support enrichment analysis. Each RCV term is manually mapped to a list of GO terms. The groupings are tailored to the research aims of Roche and, as a result, many groupings are out of scope for GO classes. For example, many RCV terms group processes and cell parts according to the cell type they occur in. The manual mapping strategy is labour intensive and hard to sustain as the GO evolves. We have automated mappings between the RCV and the GO via OWL-EL queries. This is made possible by extensive axiomatisation linking the GO to ontologies of cells, anatomical entities and chemicals. We can fully automate mapping for approximately one third of the terms in the RCV, with another 40% having 10 or fewer GO terms requiring manual mapping. Automated mapping uncovers many missing mappings. GSEA using the resulting semi-automated mapping of RCV to GO detects enrichment of gene sets missed with the manual-only mapping. The OWL query approach we describe can be used as the basis of new ways to query the GO, group annotations and carry out GSEA. Importantly, it allows the classifications used in enrichment analysis to be much more closely tailored to the needs of researchers and industry than was previously possible.
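The idea of grouping GO terms by the cell type they involve can be illustrated with a minimal Python sketch. It is not the authors' pipeline and does not use an OWL reasoner: it only approximates a query of the form "relation some filler" by walking subclass links in a toy graph. The term names, the `involves_cell` relation, and the functions `ancestors` and `query_some` are all assumptions made up for illustration, not real GO or RCV content.

```python
# Toy subclass (is_a) hierarchy: child -> list of parents.
# All names are hypothetical, not real GO or Cell Ontology identifiers.
IS_A = {
    "T cell": ["lymphocyte"],
    "B cell": ["lymphocyte"],
    "lymphocyte": ["cell"],
    "T cell activation": ["lymphocyte activation"],
    "alpha-beta T cell activation": ["T cell activation"],
    "B cell activation": ["lymphocyte activation"],
}

# Toy existential restrictions: class -> list of (relation, filler).
# "involves_cell" is an invented relation name for this sketch.
RESTRICTIONS = {
    "lymphocyte activation": [("involves_cell", "lymphocyte")],
    "T cell activation": [("involves_cell", "T cell")],
    "B cell activation": [("involves_cell", "B cell")],
}


def ancestors(term, is_a):
    """Return the term itself plus all of its transitive is_a ancestors."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(is_a.get(current, []))
    return seen


def query_some(relation, filler, is_a, restrictions):
    """Approximate the class query 'relation some filler'.

    A term matches if it, or one of its ancestors, carries a restriction
    whose relation equals `relation` and whose filler is `filler` or a
    subclass of `filler`. Property chains and equivalence axioms, which
    a real OWL-EL reasoner would handle, are ignored here.
    """
    hits = set()
    for term in set(is_a) | set(restrictions):
        for anc in ancestors(term, is_a):
            for rel, target in restrictions.get(anc, []):
                if rel == relation and filler in ancestors(target, is_a):
                    hits.add(term)
    return hits


t_cell_terms = query_some("involves_cell", "T cell", IS_A, RESTRICTIONS)
print(sorted(t_cell_terms))
```

In this toy graph the query for "T cell" picks up both the directly axiomatised term and its subclass, while the broader "lymphocyte activation" is excluded because its filler is more general than the queried cell type. A production mapping would delegate this to an OWL-EL reasoner over the real ontologies.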

Related resources

Using OWL reasoning to support the generation of novel gene sets for enrichment analysis

BACKGROUND The Gene Ontology (GO) consists of over 40,000 terms for biological processes, cell components and gene product activities linked into a graph structure by over 90,000 relationships. It has been used to annotate the functions and cellular locations of several million gene products. The graph structure is used by a variety of tools to group annotated genes into sets whose products sha...

Mapping of TP53 protein network using cytoscape software

TP53 acts as a tumor suppressor in cancer. It induces cell cycle arrest or apoptosis in response to cellular stress and damage. p53 gene alteration could cause uncontrolled cell proliferation. In the present study, we used the TP53 gene as the seed in the construction of a protein-protein functional association network to identify genes that might be involved in the tumorigenesis process with TP53. TP53 prot...

The Zebrafish Information Network (ZFIN): the zebrafish model organism database

The Zebrafish Information Network (ZFIN) is a web based community resource that serves as a centralized location for the curation and integration of zebrafish genetic, genomic and developmental data. ZFIN is publicly accessible at http://zfin.org. ZFIN provides an integrated representation of mutants, genes, genetic markers, mapping panels, publications and community contact data. Recent enhanc...

Identification and prioritization genes related to Hypercholesterolemia QTLs using gene ontology and protein interaction networks

Gene identification represents the first step to a better understanding of the physiological role of the underlying protein and disease pathways, which in turn serves as a starting point for developing therapeutic interventions. Familial hypercholesterolemia is a hereditary metabolic disorder characterized by high low-density lipoprotein cholesterol levels. Hypercholesterolemia is a quantitativ...

The Effect of Concept Mapping on Iranian EFL Learners’ Vocabulary Learning and Strategy Use

This study aimed to investigate the effects of concept mapping on the extent to which Iranian EFL learners retain new vocabularies and the degree of awareness toward vocabulary learning strategies they tended to use. To this end, a total of 40 Iranian EFL students were asked to participate in this study. They were randomly assigned to two equal groups; namely, experimental and control. The part...

Publication date: 2015